AMENDMENT TO THE CLAIMS

Please consider the claims as follows: 

1. (Currently Amended) A system for processing an audio signal comprising:

means for dividing the audio signal into segments, each segment representing a 
portion of the audio signal occurring in one of a succession of time intervals;

means for detecting for each segment the presence of a fundamental frequency; 
means responsive to the detecting means for determining the voicing probability 
for each segment by computing a ratio between voiced and unvoiced components of the 
audio signal, the determining means comprising: 

means for windowing each segment of the audio signal; 
means for computing the spectrum of the windowed segment; 
means for computing correlation coefficients of each segment using at least 
the spectrum; and

means for comparing the correlation coefficients with a voicing threshold for
each segment; 

means for separating the signal in each segment into a voiced portion and an 
unvoiced portion on the basis of the voicing probability, wherein the voiced portion of 
the signal occupies the low end of the spectrum and the unvoiced portion of the signal
occupies the high end of the spectrum for each segment; and 

means for separately encoding the voiced portion and the unvoiced portion of the 
audio signal, wherein the means for separately encoding further includes means for
computing LPC coefficients for a speech segment and means for transforming LPC
coefficients into line spectral frequencies (LSF) coefficients corresponding to the LPC
coefficients. 
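
Illustrative note (not part of the claims): the claim recites means for computing LPC coefficients for a speech segment but does not specify a method. The sketch below assumes the common autocorrelation method solved with the Levinson-Durbin recursion; the function names and the prediction order are hypothetical, and the subsequent LPC-to-LSF conversion is not shown.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] for one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err), where a[0] == 1.0 and the prediction-error filter is
    A(z) = sum(a[k] * z**-k), and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: leave remaining coefficients at zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of a first-order process with coefficient 0.5 (assumed test input).
a, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the recursion recovers a first-order predictor: the second coefficient comes out as zero and the residual energy is 0.75.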



2. (Original) The system of Claim 1, wherein the audio signal is a speech signal
and the means for determining the voicing probability further comprises means for
refining the fundamental frequency of each segment using at least the spectrum of the
windowed segment.

3. (Cancelled) The system of Claim 1, wherein the means for encoding comprises
means for computing LPC coefficients for a speech segment and means for
transforming LPC coefficients into line spectral frequencies (LSF) coefficients
corresponding to the LPC coefficients. 

4. (Original) The system of Claim 1, wherein the means for computing the 
spectrum of the windowed segment comprises means for performing a Fast Fourier 
Transform (FFT) of the windowed segment. 
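
Illustrative note (not part of the claims): windowing a segment and taking its FFT could be sketched as follows. The Hann window, the radix-2 recursion, and all function names are assumptions made for this example; the claims do not name a window shape or an FFT variant.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (an assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def windowed_spectrum(segment):
    """Apply the window to the segment, then return its complex spectrum."""
    window = hann_window(len(segment))
    return fft([s * w for s, w in zip(segment, window)])


# A constant segment: its windowed spectrum is the transform of the window itself.
spectrum = windowed_spectrum([1.0] * 8)
```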

5. (Original) The system of Claim 1, further comprising means for estimating the
voicing threshold for each segment comprising: 

means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the
spectrum; 

means for evaluating at least one voice measurement for each of the plurality of 
bands, where the at least one voice measurement is the normalized correlation 
coefficients calculated in the frequency domain; 

means for computing the low band energy of the spectrum; 

means for computing an energy ratio between the energy of the high and low 
bands of the spectrum of a current segment and a previous segment; and

a multi-layer neural network classifier for receiving the normalized correlation
coefficients of the low bands, the low band energy and the energy ratio. 
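
Illustrative note (not part of the claims): one way to obtain non-linear bands with finer resolution at low frequencies is to let each band be twice as wide as the one below it. The doubling rule, the split of bands into a "low" and a "high" half, and the function names are assumptions for this sketch. It computes the ratio for a single segment only and does not model the comparison with a previous segment or the neural network classifier.

```python
def nonlinear_band_edges(n_bins, n_bands):
    """Band edges (in bin indices) whose widths double from band to band."""
    total = 2 ** n_bands - 1
    return [round(n_bins * (2 ** k - 1) / total) for k in range(n_bands + 1)]


def band_energies(power_spectrum, edges):
    """Sum the power spectrum inside each band."""
    return [sum(power_spectrum[lo:hi]) for lo, hi in zip(edges, edges[1:])]


def high_to_low_energy_ratio(energies):
    """Energy in the upper half of the bands divided by energy in the lower half."""
    half = len(energies) // 2
    low = sum(energies[:half])
    high = sum(energies[half:])
    return high / low if low > 0.0 else float("inf")


edges = nonlinear_band_edges(15, 4)
energies = band_energies([1.0] * 15, edges)
ratio = high_to_low_energy_ratio(energies)
```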

6. (Original) The system of Claim 1, further comprising means for spectrally 
estimating the audio signal comprising: 

means for calculating a complex spectrum for each segment by using a window
based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment. 

7. (Original) The system of Claim 6, wherein the means for calculating the complex
spectrum comprises means for applying a Fast Fourier Transform to the windowed
segment.

8. (Currently Amended) A system for processing an audio signal comprising:

means for dividing the signal into segments, each segment representing a
portion of the audio signal in one of a succession of time intervals;

means for detecting for each segment the presence of a fundamental frequency; 

means responsive to the detecting means for determining the voicing probability 
for each segment by computing a ratio between voiced and unvoiced components of the 
audio signal; 

means for calculating a complex spectrum for each segment by using a window 
based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral
frequencies (LSF) coefficients and a signal gain of each segment;

means for separating the signal in each segment into a voiced portion and an 
unvoiced portion on the basis of the voicing probability, wherein the voiced portion of 
the signal occupies the low end of the spectrum and the unvoiced portion of the signal
occupies the high end of the spectrum for each segment; and

means for separately encoding the voiced portion and the unvoiced portion of the 
audio signal, wherein the means for separately encoding further includes means for
computing LPC coefficients for a speech segment and means for transforming LPC
coefficients into line spectral frequencies (LSF) coefficients corresponding to the LPC
coefficients.

9. (Original) The system of Claim 8, wherein the audio signal is a speech signal 
and the means for determining the voicing probability comprises means for refining the 
fundamental frequency of each segment using at least the spectrum of the windowed
segment. 

10. (Cancelled) The system of Claim 8, wherein the means for encoding comprises
means for computing LPC coefficients for a speech segment and means for
transforming LPC coefficients into line spectral frequencies (LSF) coefficients 
corresponding to the LPC coefficients. 

11. (Original) The system of Claim 8, wherein the means for computing the 
spectrum of the windowed segment comprises means for performing a Fast Fourier 
Transform (FFT) of the windowed segment. 

12. (Original) The system of Claim 8, wherein the means for determining the voicing 
probability comprises: 

means for windowing each segment of the input signal; 
means for computing the spectrum of the windowed segment; 
means for computing correlation coefficients of each segment using at least the 
spectrum; and 

means for comparing the correlation coefficients with a voicing threshold for each
segment.

13. (Original) The system of Claim 12, further comprising means for estimating the 
voicing threshold for each segment comprising: 

means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the
spectrum;

means for evaluating at least one voice measurement for each of the plurality of 
bands, where the at least one voice measurement is the normalized correlation 
coefficients calculated in the frequency domain; 

means for computing the low band energy of the spectrum; 

means for computing an energy ratio between the energy of the high and low 
bands of the spectrum of a current segment and a previous segment; and 

a multi-layer neural network classifier for receiving the normalized correlation 
coefficients of the low bands, the low band energy and the energy ratio.

14. (Original) The system of Claim 8, wherein the means for calculating the complex
spectrum comprises means for applying a Fast Fourier Transform to the windowed
segment.

15. (Currently Amended) A system for processing an audio signal having a number 
of frames, the system comprising: 

an encoder comprising: 

first means for determining for each frame a ratio between voiced and 
unvoiced components of the audio signal on the basis of the fundamental 
frequency of each frame, the ratio being defined as a voicing probability, the 
means for determining the voicing probability comprising:

means for windowing each frame of the input signal; 
means for computing the spectrum of the windowed frame; 
means for computing correlation coefficients of each frame using at
least the spectrum; and 

means for comparing the correlation coefficients with a voicing 
threshold for each segment; 
second means for determining at least a pitch period, a mid-frame pitch 
period, and/or and a mid-frame voicing probability of the audio signal; and

means for quantizing at least the pitch period, the voicing probability, the 
mid-frame pitch period, and/or and the mid-frame voicing probability. 

16. (Original) The system of Claim 15, further comprising a decoder comprising: 
means for unquantizing at least the pitch period, the voicing probability, the mid- 
frame pitch period, and/or the mid-frame voicing probability and providing at least one 
output; and 

means for analyzing the at least one output to produce a synthetic speech signal 
corresponding to the input audio signal. 

17. (Original) The system of Claim 15, further comprising means for estimating the 
voicing threshold for each segment comprising:

means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the 
spectrum; 

means for evaluating at least one voice measurement for each of the plurality of 
bands, where the at least one voice measurement is the normalized correlation 
coefficients calculated in the frequency domain; 

means for computing the low band energy of the spectrum; 
means for computing an energy ratio between the energy of the high and low bands of 
the spectrum of a current segment and a previous segment; and 

means for receiving the normalized correlation coefficients of the low bands, the
low band energy and the energy ratio. 

18. (Original) The system of Claim 17, wherein the means for receiving is a multi- 
layer neural network classifier. 

19. (Original) The system of Claim 18, wherein the voicing probability is zero if an 
output from the means for receiving is less than a predetermined threshold for a 
predetermined number of frames. 
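
Illustrative note (not part of the claims): the rule that the voicing probability is zero once the classifier output has stayed below a threshold for a set number of frames could be sketched as a small counter. The class name, the default threshold of 0.5, and the default of three frames are hypothetical values chosen for the example; the claim requires only that both be predetermined.

```python
class VoicingGate:
    """Zero the voicing probability after a run of low classifier outputs."""

    def __init__(self, threshold=0.5, hold_frames=3):
        self.threshold = threshold      # assumed predetermined threshold
        self.hold_frames = hold_frames  # assumed predetermined number of frames
        self.low_count = 0              # consecutive frames below the threshold

    def update(self, classifier_output, voicing_probability):
        """Return the voicing probability to use for the current frame."""
        if classifier_output < self.threshold:
            self.low_count += 1
        else:
            self.low_count = 0
        if self.low_count >= self.hold_frames:
            return 0.0
        return voicing_probability


gate = VoicingGate(threshold=0.5, hold_frames=2)
results = [gate.update(output, 0.8) for output in [0.9, 0.2, 0.1, 0.7]]
```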

20. (Original) The system of Claim 15, further comprising means for high-
pass filtering the audio signal and buffering the audio signal into the number of frames.
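
Illustrative note (not part of the claims): high-pass filtering followed by buffering into frames could look like the sketch below. The first-order DC-blocking filter, the pole value, the non-overlapping frames, and the frame length of 160 samples are all assumptions; the claim does not specify a filter order, cutoff, or frame size.

```python
def high_pass(samples, pole=0.95):
    """First-order DC-blocking filter: y[n] = x[n] - x[n-1] + pole * y[n-1]."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


def buffer_frames(samples, frame_length):
    """Split samples into consecutive non-overlapping frames, dropping a short tail."""
    count = len(samples) // frame_length
    return [samples[i * frame_length:(i + 1) * frame_length] for i in range(count)]


# A constant (DC) input: the filter output decays toward zero.
filtered = high_pass([1.0] * 400)
frames = buffer_frames(filtered, 160)
```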

21. (Original) The system of Claim 15, wherein the encoder further comprises 
spectral estimation means for computing an estimate of the power spectrum of the 
audio signal using a pitch adaptive window. 

22. (Original) The system of Claim 21, wherein the length of the pitch adaptive 
window is based on the fundamental frequency of the audio signal. 
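
Illustrative note (not part of the claims): a window whose length follows the fundamental frequency could be sketched by spanning a fixed number of pitch periods. The factor of 2.5 periods, the 8000 Hz sample rate, the length limits, and the Hamming shape are assumptions for this example only.

```python
import math


def pitch_adaptive_window(f0_hz, sample_rate=8000, periods=2.5,
                          min_len=80, max_len=320):
    """Hamming window spanning roughly `periods` pitch periods, within limits."""
    if f0_hz <= 0.0:
        raise ValueError("fundamental frequency must be positive")
    pitch_period = sample_rate / f0_hz  # pitch period in samples
    length = int(round(periods * pitch_period))
    length = max(min_len, min(max_len, length))
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]


window = pitch_adaptive_window(100.0)
```

A lower fundamental frequency gives a longer pitch period and therefore a longer window, up to the assumed maximum.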

23. (Original) The system of Claim 16, wherein the means for unquantizing 
comprises: 

Serial No. 09/625,960 (Atty. Docket No. Aguilar 24-1-1 (LCNT/122485))

Amendment dated March 7, 2005

Reply to Office Action of December 7, 2004

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing 
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the 
minimum phase envelope to the means for analyzing; 

means for estimating the signal-to-noise ratio of the audio signal using the at 
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability; and 

means for generating at least one control parameter using at least the signal-to- 
noise ratio and for outputting the at least one control parameter to the means for 
analyzing. 

24. (Original) The system of Claim 16, wherein the means for analyzing comprises: 
first means for processing the at least one output to produce a time-domain
signal; and

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal. 

25. (Original) The system of Claim 24, wherein the first means for processing the at
least one output to produce the time-domain signal comprises: 

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered 
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated 
frequencies; and 

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain 
signal.

26. (Original) The system of Claim 15, further comprising:

means for calculating a complex spectrum for each segment by using a window 
based on the fundamental frequency; and 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment. 

27. (Original) The system of Claim 26, wherein the means for calculating the 
complex spectrum comprises means for applying a Fast Fourier Transform to the
windowed segment.

28. (Currently Amended) A system for processing an audio signal having a number
of frames, the system comprising:

an encoder comprising:

means for determining for each frame a ratio between voiced and
unvoiced components of the audio signal on the basis of the fundamental frequency of 
each frame, the ratio being defined as a voicing probability; 

means for calculating a complex spectrum for each segment by using a 
window based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment; 

means for determining at least a pitch period, a mid-frame pitch period, 
and/or and a mid-frame voicing probability of the audio signal; and

means for quantizing at least the pitch period, the voicing probability, the 
mid-frame pitch period, and/or and the mid-frame voicing probability. 

29. (Original) The system of Claim 28, further comprising a decoder comprising: 
means for unquantizing at least the pitch period, the voicing probability, the mid- 
frame pitch period, and/or the mid-frame voicing probability and providing at least one 
output; and

means for analyzing the at least one output to produce a synthetic speech signal
corresponding to the input audio signal.

30. (Original) The system of Claim 28, further comprising means for estimating the 
voicing threshold for each segment comprising: 

means for dividing the spectrum into a plurality of non-linear bands, where the 
low bands of the spectrum have a higher resolution than the high bands of the 
spectrum; 

means for evaluating at least one voice measurement for each of the plurality of
bands, where the at least one voice measurement is the normalized correlation
coefficients calculated in the frequency domain; 

means for computing the low band energy of the spectrum; 

means for computing an energy ratio between the energy of the high and low 
bands of the spectrum of a current segment and a previous segment; and 

means for receiving the normalized correlation coefficients of the low bands, the 
low band energy and the energy ratio. 

31. (Original) The system of Claim 30, wherein the means for receiving is a multi- 
layer neural network classifier. 

32. (Original) The system of Claim 31, wherein the voicing probability is zero if an 
output from the means for receiving is less than a predetermined threshold for a 
predetermined number of frames. 

33. (Original) The system of Claim 28, further comprising means for high-pass 
filtering the audio signal and buffering the audio signal into the number of frames. 

34. (Original) The system of Claim 28, wherein the encoder further comprises 
spectral estimation means for computing an estimate of the power spectrum of the
audio signal using a pitch adaptive window.

35. (Original) The system of Claim 34, wherein the length of the pitch adaptive
window is based on the fundamental frequency of the audio signal. 

36. (Original) The system of Claim 29, wherein the means for unquantizing 
comprises: 

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing 
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the 
minimum phase envelope to the means for analyzing; 

means for estimating the signal-to-noise ratio of the audio signal using the at
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability; and 

means for generating at least one control parameter using at least the signal-to- 
noise ratio and for outputting the at least one control parameter to the means for 
analyzing. 

37. (Original) The system of Claim 29, wherein the means for analyzing comprises: 
first means for processing the at least one output to produce a time-domain
signal; and

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal.

38. (Original) The system of Claim 37, wherein the first means for processing the at
least one output to produce the time-domain signal comprises: 

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered 
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated
frequencies; and

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain 
signal. 
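
Illustrative note (not part of the claims): the final step of this claim, summing sinusoids to produce a time-domain signal, could be sketched as below. The frequencies, amplitudes, and phases are taken as given inputs here; how they are derived from the filtered spectral magnitude envelope is not shown. The function name and the 8000 Hz sample rate are assumptions.

```python
import math


def sum_of_sinusoids(frequencies_hz, amplitudes, phases, n_samples,
                     sample_rate=8000):
    """Synthesize n_samples of a signal as a sum of cosine components."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate  # time of sample n in seconds
        out.append(sum(a * math.cos(2.0 * math.pi * f * t + p)
                       for f, a, p in zip(frequencies_hz, amplitudes, phases)))
    return out


# One component at a quarter of the sample rate, unit amplitude, zero phase.
samples = sum_of_sinusoids([2000.0], [1.0], [0.0], 4)
```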

39. (Original) The system of Claim 28, wherein the means for determining the 
voicing probability comprises: 

means for windowing each frame of the input signal; 
means for computing the spectrum of the windowed frame; 
means for computing correlation coefficients of each frame using at least the 
spectrum; and 

means for comparing the correlation coefficients with a voicing threshold for each 
segment. 

40. (Original) The system of Claim 28, wherein the means for calculating the 
complex spectrum comprises means for applying a Fast Fourier Transform to the 
windowed segment.

41. (Withdrawn) A system for processing an audio signal having a number of 
frames, the system comprising: 

a decoder comprising: 

means for unquantizing at least a pitch period, a voicing probability, a mid-frame 
pitch period, and/or a mid-frame voicing probability of the audio signal and providing at 
least one output, where the means for unquantizing comprises means for generating at 
least one control parameter using at least the signal-to-noise ratio computed using a 
gain and the voicing probability of the audio signal; and 

means for analyzing the at least one output, including the at least one control 
parameter, to produce a synthetic speech signal corresponding to the input audio 
signal. 

42. (Withdrawn) The system of Claim 41, wherein the means for unquantizing
comprises:

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the 
minimum phase envelope to the means for analyzing; and 

means for estimating the signal-to-noise ratio of the audio signal using the at 
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability and 
outputting the signal-to-noise ratio to the means for generating at least one control 
parameter. 

43. (Withdrawn) The system of Claim 41, wherein the means for analyzing 
comprises: 

first means for processing the at least one output to produce a time-domain 
signal; and 

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal.

44. (Withdrawn) The system of Claim 43, wherein the first means for processing the 
at least one output to produce the time-domain signal comprises: 

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered 
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated 
frequencies; and 

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain 
signal.